Magnetic Resonance Spectroscopic Imaging (MRSI) is an essential tool for quantifying metabolites in vivo, but its low spatial resolution limits clinical applications. Deep learning-based super-resolution methods have shown promising results for improving the spatial resolution of MRSI, but the super-resolved images are often blurry compared with experimentally acquired high-resolution images. Attempts have been made with generative adversarial networks to improve image visual quality. In this work, we consider another type of generative model, the flow-based model, whose training is more stable and interpretable compared with adversarial networks. Specifically, we propose a flow-based enhancer network to improve the visual quality of super-resolved MRSI. Unlike previous flow-based models, our enhancer network incorporates anatomical information from an additional image modality (MRI) and uses a learnable base distribution. In addition, we impose a guide loss and a data-consistency loss to encourage the network to generate images with high visual quality while maintaining high fidelity. Experiments on a 1H-MRSI dataset acquired from 25 high-grade glioma patients indicate that our enhancer network outperforms the adversarial networks and the baseline flow-based methods. Our method also allows visual-quality adjustment and uncertainty estimation.
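The appeal of a flow-based model is that its exact likelihood is tractable through the change-of-variables formula, which is also what makes a learnable base distribution a natural extension. A minimal one-dimensional affine-flow sketch, purely illustrative (the actual enhancer is a deep conditional flow; all names and defaults here are assumptions):

```python
import math

def gaussian_logpdf(z, mean=0.0, std=1.0):
    """Log-density of a Gaussian base distribution (mean/std could be learned)."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (z - mean) ** 2 / (2 * std ** 2)

def affine_flow_logprob(x, scale, shift, base_mean=0.0, base_std=1.0):
    """Exact log-likelihood of the invertible flow x = scale * z + shift.

    Change of variables: log p(x) = log p_base(z) - log|scale|,
    where z = (x - shift) / scale.
    """
    z = (x - shift) / scale
    return gaussian_logpdf(z, base_mean, base_std) - math.log(abs(scale))
```

With scale = 1 and shift = 0 the flow is the identity, so the likelihood reduces to the base density; training maximizes this exact log-likelihood rather than an adversarial objective, which is why flow training tends to be more stable.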
Magnetic Resonance Spectroscopic Imaging (MRSI) is a valuable tool for studying metabolic activities in the human body, but current applications are limited to low spatial resolution. Existing deep learning-based MRSI super-resolution methods require training a separate network for each upscaling factor, which is time-consuming and memory-inefficient. We tackle this multi-scale super-resolution problem using a filter scaling strategy that modulates the convolution filters based on the upscaling factor, so that a single network can be used for various upscaling factors. Observing that each metabolite has distinct spatial characteristics, we also modulate the network based on the specific metabolite. Furthermore, our network is conditioned on the weight of the adversarial loss, so that the perceptual sharpness of the super-resolved metabolic maps can be adjusted within a single network. We incorporate these network conditionings using a novel multi-conditional module. Experiments were conducted on a 1H-MRSI dataset from 15 high-grade glioma patients. Results show that the proposed network achieves the best performance among several multi-scale super-resolution methods and can provide super-resolved metabolic maps with adjustable sharpness.
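The filter scaling idea, one base set of convolution filters modulated by a function of the upscaling factor, can be sketched in miniature. The modulation function below is a hypothetical stand-in for whatever small learned network produces the per-filter gains in the actual method:

```python
def conv1d(signal, kernel):
    """Valid-mode 1D convolution in pure Python (no padding)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def scaled_filter(base_kernel, upscale_factor, modulation):
    """Filter scaling: one shared kernel serves every upscale factor by
    multiplying it with a factor-dependent modulation vector g(s)."""
    g = modulation(upscale_factor)  # in practice, a small learned mapping
    return [w * gi for w, gi in zip(base_kernel, g)]

# Illustrative modulation: a per-tap gain that grows with the upscale factor.
mod = lambda s: [1.0 + 0.1 * s] * 3
x = [1.0, 2.0, 3.0, 4.0]
y2 = conv1d(x, scaled_filter([0.25, 0.5, 0.25], 2, mod))  # x2 upscaling branch
y4 = conv1d(x, scaled_filter([0.25, 0.5, 0.25], 4, mod))  # x4 upscaling branch
```

The same storage and the same base weights serve both factors; only the cheap modulation differs, which is what makes a single multi-scale network memory-efficient.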
Single-Photon Emission Computed Tomography (SPECT) is a widely applied imaging approach for the diagnosis of coronary artery diseases. Attenuation maps (u-maps) derived from computed tomography (CT) are used for attenuation correction (AC) to improve the diagnostic accuracy of cardiac SPECT. However, SPECT and CT are acquired sequentially in clinical practice, which can potentially introduce misregistration between the two scans. Convolutional neural networks (CNNs) are powerful tools for medical image registration. Previous CNN-based cross-modality registration methods either directly concatenated the two input modalities as early feature fusion, or fused the image features extracted by two separate CNN modules as late fusion. These methods cannot fully extract or fuse cross-modality information. In addition, deep-learning-based rigid registration of cardiac SPECT and CT-derived u-maps has not been investigated before. In this paper, we propose a Dual-Branch Squeeze-Fusion-Excitation (DuSFE) module for the registration of cardiac SPECT and CT-derived u-maps. DuSFE fuses the knowledge from multiple modalities to recalibrate both channel-wise and spatial features of each modality. DuSFE can be embedded at multiple convolutional layers to enable feature fusion at different spatial dimensions. Our studies using clinical data demonstrated that a network embedded with DuSFE generated lower registration errors, and therefore more accurate AC SPECT images, than previous methods.
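The squeeze-and-excitation mechanism that DuSFE builds on can be sketched as: squeeze each modality's feature map into per-channel descriptors, derive gating weights from both modalities jointly, and use them to recalibrate the channels. A toy pure-Python version (the real module uses learned layers and also recalibrates spatial features; the names and gating here are illustrative):

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def squeeze(feature_map):
    """Squeeze: global average pooling per channel -> one descriptor per channel."""
    return [sum(ch) / len(ch) for ch in feature_map]

def cross_modality_excite(feat_a, feat_b):
    """Recalibrate modality A's channels with gates computed from BOTH
    modalities' channel descriptors (a toy stand-in for DuSFE's fusion)."""
    desc = [da + db for da, db in zip(squeeze(feat_a), squeeze(feat_b))]
    gates = [sigmoid(d) for d in desc]  # learned excitation layers in practice
    return [[v * g for v in ch] for ch, g in zip(feat_a, gates)]

# Two channels with flattened spatial dims, for SPECT and the CT-derived u-map:
spect = [[1.0, 3.0], [0.0, 0.0]]
ct    = [[2.0, 2.0], [0.0, 0.0]]
out = cross_modality_excite(spect, ct)
```

Because the gates depend on both inputs, each branch's features are reweighted by what the other modality observed, which is the sense in which this sits between early and late fusion.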
Many medical datasets have recently been created for medical image segmentation tasks, and it is natural to ask whether we can use them to sequentially train a single model that (1) performs better on all these datasets and (2) generalizes well and transfers better to unknown target site domains. Previous works have achieved this goal by jointly training one model on multi-site datasets, which achieves competitive performance on average, but such methods rely on the assumption that all training data are available, limiting their effectiveness in practical deployment. In this paper, we propose a novel multi-site segmentation framework called incremental-transfer learning (ITL), which learns a model from multi-site datasets in an end-to-end sequential fashion. Specifically, "incremental" refers to training on sequentially constructed datasets, and "transfer" is achieved by leveraging useful information from the linear combination of embedding features on each dataset. In addition, we introduce our ITL framework, in which we train a network comprising a site-agnostic encoder with pretrained weights and at most two segmentation decoder heads. We also design a novel site-level incremental loss in order to generalize well on the target domain. Moreover, we show for the first time that our ITL training scheme is able to alleviate the challenging catastrophic forgetting problem encountered in incremental learning. We conducted experiments using five challenging benchmark datasets to validate the effectiveness of our incremental-transfer learning approach. Our approach makes minimal assumptions on computational resources and domain-specific expertise, and hence constitutes a strong starting point for multi-site medical image segmentation.
We demonstrate a proof-of-concept of a large language model conducting corporate lobbying related activities. We use an autoregressive large language model (OpenAI's text-davinci-003) to determine if proposed U.S. Congressional bills are relevant to specific public companies and provide explanations and confidence levels. For the bills the model deems as relevant, the model drafts a letter to the sponsor of the bill in an attempt to persuade the congressperson to make changes to the proposed legislation. We use hundreds of ground-truth labels of the relevance of a bill to a company to benchmark the performance of the model, which outperforms the baseline of predicting the most common outcome of irrelevance. However, we test the ability to determine the relevance of a bill with the previous OpenAI GPT-3 model (text-davinci-002), which was state-of-the-art on many language tasks until text-davinci-003 was released on November 28, 2022. The performance of text-davinci-002 is worse than simply always predicting that a bill is irrelevant to a company. These results suggest that, as large language models continue to improve core natural language understanding capabilities, performance on corporate lobbying related tasks will continue to improve. We then discuss why this could be problematic for societal-AI alignment.
Variational autoencoders model high-dimensional data by positing low-dimensional latent variables that are mapped through a flexible distribution parametrized by a neural network. Unfortunately, variational autoencoders often suffer from posterior collapse: the posterior of the latent variables is equal to its prior, rendering the variational autoencoder useless as a means to produce meaningful representations. Existing approaches to posterior collapse often attribute it to the use of neural networks or optimization issues due to variational approximation. In this paper, we consider posterior collapse as a problem of latent variable non-identifiability. We prove that the posterior collapses if and only if the latent variables are non-identifiable in the generative model. This fact implies that posterior collapse is not a phenomenon specific to the use of flexible distributions or approximate inference. Rather, it can occur in classical probabilistic models even with exact inference, which we also demonstrate. Based on these results, we propose a class of latent-identifiable variational autoencoders, deep generative models which enforce identifiability without sacrificing flexibility. This model class resolves the problem of latent variable non-identifiability by leveraging bijective Brenier maps and parameterizing them with input convex neural networks, without special variational inference objectives or optimization tricks. Across synthetic and real datasets, latent-identifiable variational autoencoders outperform existing methods in mitigating posterior collapse and providing meaningful representations of the data.
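The claimed equivalence can be stated compactly. In generic notation (the abstract does not fix symbols, so these are assumed): posterior collapse means the posterior reverts to the prior, and latent non-identifiability means the likelihood does not depend on the latent variable:

```latex
\underbrace{p_\theta(z \mid x) = p(z) \ \text{for almost all } x}_{\text{posterior collapse}}
\quad\iff\quad
\underbrace{p_\theta(x \mid z) = p_\theta(x \mid z') \ \text{for all } z, z'}_{\text{latent non-identifiability}}
```

Stated this way, the equivalence makes clear why the phenomenon is not tied to neural networks or approximate inference: both sides are properties of the generative model itself.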
We introduce Argoverse 2 (AV2) - a collection of three datasets for perception and forecasting research in the self-driving domain. The annotated Sensor Dataset contains 1,000 sequences of multimodal data, encompassing high-resolution imagery from seven ring cameras, and two stereo cameras in addition to lidar point clouds, and 6-DOF map-aligned pose. Sequences contain 3D cuboid annotations for 26 object categories, all of which are sufficiently-sampled to support training and evaluation of 3D perception models. The Lidar Dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose. This dataset is the largest ever collection of lidar sensor data and supports self-supervised learning and the emerging task of point cloud forecasting. Finally, the Motion Forecasting Dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene. Models are tasked with the prediction of future motion for "scored actors" in each scenario and are provided with track histories that capture object location, heading, velocity, and category. In all three datasets, each scenario contains its own HD Map with 3D lane and crosswalk geometry - sourced from data captured in six distinct cities. We believe these datasets will support new and existing machine learning research problems in ways that existing datasets do not. All datasets are released under the CC BY-NC-SA 4.0 license.
In this paper we derive a PAC-Bayesian-Like error bound for a class of stochastic dynamical systems with inputs, namely, for linear time-invariant stochastic state-space models (stochastic LTI systems for short). This class of systems is widely used in control engineering and econometrics, in particular, they represent a special case of recurrent neural networks. In this paper we 1) formalize the learning problem for stochastic LTI systems with inputs, 2) derive a PAC-Bayesian-Like error bound for such systems, 3) discuss various consequences of this error bound.
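For concreteness, a stochastic LTI state-space model with inputs can be written in a standard innovation form (the abstract does not give the exact parametrization, so this notation is assumed): with state $x_t$, input $u_t$, output $y_t$, and noise process $e_t$,

```latex
x_{t+1} = A x_t + B u_t + K e_t, \qquad y_t = C x_t + e_t
```

Unrolling the state recursion over time is what makes this class a special (linear) case of recurrent neural networks, as the abstract notes.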
We demonstrate how efficient autonomous drone swarms can be in detecting and tracking occluded targets in densely forested areas, such as lost people during search and rescue missions. Exploration and optimization of local viewing conditions, such as occlusion density and target view obliqueness, provide much faster and much more reliable results than previous, blind sampling strategies that are based on pre-defined waypoints. An adapted real-time particle swarm optimization and a new objective function are presented that are able to deal with dynamic and highly random through-foliage conditions. Synthetic aperture sensing is our fundamental sampling principle, and drone swarms are employed to approximate the optical signals of extremely wide and adaptable airborne lenses.
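The optimizer underneath is particle swarm optimization; the paper adapts it for real-time use and designs a new through-foliage objective, neither of which is shown here. A minimal textbook PSO with the standard velocity/position update, hyperparameters purely illustrative:

```python
import random

def pso(objective, dim, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` with the standard PSO velocity/position updates."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                     # per-particle best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]    # swarm-wide best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Sphere function: minimum 0 at the origin.
best, best_val = pso(lambda p: sum(x * x for x in p), dim=2)
```

In the paper's setting the "position" of a particle would correspond to drone viewpoint parameters and the objective to local viewing conditions such as occlusion density and target view obliqueness, evaluated online rather than on a fixed test function.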
Generative AI has matured to a point where large-scale models can generate text that seems indistinguishable from human-written text and remarkably photorealistic images. Automatically measuring how close the distribution of generated data is to the target real data distribution is a key step in diagnosing existing models and developing better models. We present MAUVE, a family of comparison measures between pairs of distributions such as those encountered in the generative modeling of text or images. These scores are statistical summaries of divergence frontiers capturing two types of errors in generative modeling. We explore four approaches to statistically estimate these scores: vector quantization, non-parametric estimation, classifier-based estimation, and parametric Gaussian approximations. We provide statistical bounds for the vector quantization approach. Empirically, we find that the proposed scores paired with a range of $f$-divergences and statistical estimation methods can quantify the gaps between the distributions of human-written text and those of modern neural language models by correlating with human judgments and identifying known properties of the generated texts. We conclude the paper by demonstrating its applications to other AI domains and discussing practical recommendations.
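Under the vector-quantization approach, both distributions are reduced to histograms over a shared codebook, and the divergence frontier is traced by measuring KL divergences against mixtures of the two. A small sketch with an invented 4-bin codebook (distributions are made up for illustration; the real scores summarize the frontier rather than return its raw points):

```python
import math

def kl(p, q):
    """KL divergence between two discrete distributions on the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def divergence_frontier(p, q, lambdas):
    """Points on the divergence frontier: for each mixture weight lam,
    measure both error types against the mixture R = lam*p + (1-lam)*q."""
    points = []
    for lam in lambdas:
        r = [lam * pi + (1 - lam) * qi for pi, qi in zip(p, q)]
        points.append((kl(q, r), kl(p, r)))  # (type-I, type-II) style errors
    return points

# Quantized "human" vs "model" text distributions over a 4-bin codebook:
p = [0.4, 0.3, 0.2, 0.1]
q = [0.25, 0.25, 0.25, 0.25]
frontier = divergence_frontier(p, q, [0.25, 0.5, 0.75])
```

The two coordinates capture the two failure modes of a generative model: mass the model places where humans do not, and human mass the model fails to cover.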